Image processing method and system

ABSTRACT

The invention provides an image processing method and system wherein an image is conceptually textured onto the surface of a three dimensional shape via a projection thereonto. The shape and/or the image position are then moved relative to each other, preferably by a rotation about one or more axes of the shape, and a second projection taken of the textured surface back to the image position to obtain a second, processed image. The view displayed within the processed image will be seen to have undergone an aspect ratio change as a result of the processing. The invention is of particular use in simulating the small movements of humans when speaking, and in particular of processing viseme images to simulate such small movements when displayed as a sequence.

TECHNICAL FIELD

The present invention relates to an image processing method and system, to a computer program arranged to cause a computer to perform the method, and to a computer readable storage medium storing said computer program.

BACKGROUND TO THE INVENTION AND PRIOR ART

It is known that the delivery of synthesised or recorded speech messages can be enhanced by the use of an animated picture of the sender, or by displaying at least the head part of an avatar created of the sender, in both cases with only the lips moving in synchrony with the reproduced speech. Where a picture of the sender is used, the impression of movement of the lips is created by displaying what is known as a “viseme”, which is an image of a human face (for example of the message sender) provided with the lips thereof in one of a number of identifiable shapes, which each represent a lip shape associated with one or more phonemes. Phonemes are, of course, well known in the art and are the individual discrete sounds which are used within a language. It is estimated that there are approximately 44 phonemes in the English language, but perhaps only as few as twenty or so visemes. Therefore, it is possible to display the same viseme when reproducing one of several phonemes.

In operation, a speech reproducer such as a speech synthesiser outputs acoustic waveforms corresponding to a sequence of phonemes, and at the same time a display means displays to the user the appropriate viseme associated with the particular phoneme which has been reproduced at any particular time. Therefore, the user obtains the illusion of an image of the sender whose lips appear to move in synchrony with the reproduced speech. It should be noted that here the visemes are two dimensional images of the sender.

The alternative method known in the prior art, as mentioned above, is to produce either a whole body avatar, or at least a three dimensional virtual model of the sender's head, which is then shaped and textured to look like the sender. The lips of the head model can then be controlled to move in synchrony with the reproduced speech, such that the lips of the model assume the appropriate shape for the particular phoneme being reproduced at any particular time. However, such systems involve complex head modelling using a virtual wire frame reshaped by difficult image processing or invasive sensing, and require a process in which a still picture is accurately conformed to the given model. It is therefore still difficult to produce head models without undergoing invasive sensing or scanning of the person whose model is to be created, such as, for example, in a specialist avatar creation booth such as those provided by Avatar-Me Ltd, a United Kingdom limited company no. 03560745. Furthermore, once a 3D model has been obtained, the computation required to achieve the illusion of the model speaking to a user is high, and not presently suitable for implementation on mobile devices, such as mobile telephones, personal digital assistants, or the like.

The first of the aforementioned methods, being that of displaying a sequence of two-dimensional visemes in synchrony with the reproduced speech, does not suffer from the same computational intensity problems as the second of the aforementioned methods, but does suffer from the problem that the displayed image appears to be almost robotic to the viewer, in that it can appear stale, automated, and not life-like. This is because the only movement apparent to the viewer is the movement of the lips to create the appropriate viseme shape corresponding to the present phoneme being reproduced. However, such movement does not correspond to the natural movement of a human being while talking, as it has been observed that most human beings also make very small head movements at the same time as speaking (see ‘Autonomous secondary gaze behaviour, M Gillies, N Dodgeson & D Ballin, Proceedings of the AISB2002 symposium on Animating Expressive Characters for Social Interactions, ISBN 1902956256’), but such head movements are difficult to recreate artificially. Whilst it would be possible to modify the second of the aforementioned methods (i.e. that of the 3D avatar model) to cause the model to move slightly in accordance with the observed human behaviour, such movement of course brings with it the same problems of high computational intensity as already discussed. In order to get around this problem it would therefore be advantageous if the first of the aforementioned methods (i.e. the two-dimensional viseme method) could be modified to reproduce the observed behaviour.

SUMMARY OF THE INVENTION

The present invention addresses the above problem by providing an image processing method and system which is able to process the two-dimensional viseme images in order to produce processed images which, when displayed in sequence, reproduce the observed small movements of a human head during speech. The image processing is achieved conceptually by texturing the head image to be processed onto the surface of a 3D shape, which is preferably a 3D virtual shape provided within a virtual space, and then moving the shape slightly to imitate the observed human head movements. Once the shape has been moved slightly a projection of the image from the surface of the shape back to the original image position is taken, which results in a second, processed, image, which is an image of a human head with a slight aspect ratio change. When a sequence of viseme images is processed by the method in turn, and the resultant processed images subsequently displayed to a user in turn, the result is that the observed random movements of a human head during speech are simulated.

It should be noted, however, that whilst the present invention has been developed and is mainly described herein in the context of the problem described in the introductory portion of simulating small human head movements, the image processing method and system which achieves this result is not limited to this sole application, and may find application in broader fields such as, for example, the television special effects industry, computer modelling and mapping applications, or any other field where a two-dimensional image may need to be processed.

Therefore, in view of the above, according to a first aspect of the present invention there is provided an image processing method comprising the steps of:

-   a) texturing at least one surface of a three-dimensional shape with a projection of a first image from an image position and orientation thereof onto said at least one surface;
-   b) moving one or both of the shape and/or the image position relative to each other; and
-   c) projecting the textured surface of the shape to the image position to obtain a second image at the position and in the same orientation as said first image.

The present invention provides the advantage that an effective image processing operation which reproduces a three dimensional aspect change of the subject displayed in the image can be simulated. There is a further advantage that, as both the input and the output are two dimensional images, the computational intensity of an algorithm embodying the method is reduced.

Preferably, the first image forms part of a sequence of first images, the method further comprising repeating steps (a), (b), (c) for each first image in said sequence to obtain a corresponding sequence of second images. Thus, the invention can be applied to a sequence of images in turn in order to allow the same processing to be applied to an “animation” sequence.

In order to specifically address the problems of the prior art, the respective sequences of first and second images preferably each form an animated sequence of a human head speaking.

Preferably, the moving step further comprises randomly moving the shape and/or the image position. This provides the advantage that when the images are of human heads, the observed human movements which the preferred embodiment of the present invention is attempting to recreate are more accurately reproduced.

Preferably, the movement comprises rotating the three dimensional shape about one or more axes thereof. This provides the advantage that, when the images are images of human heads, the movement of the shape simulates the movement possible of a human head attached to a pair of shoulders.

From a second aspect, the present invention further provides an image processing system comprising:

-   image receiving means for receiving a first image to be processed;
-   image processing means; and
-   image output means for outputting a second, processed image;
-   characterised in that the image processing means further comprises:
-   shape modelling means arranged to model a three-dimensional shape;
-   and is further arranged to:
-   a) texture at least one surface of a three-dimensional shape with a projection of a first image from an image position and orientation thereof onto said at least one surface;
-   b) move one or both of the shape and/or the image position relative to each other; and
-   c) project the textured surface of the shape to the image position to obtain a second image at said position and in the same orientation as said first image.

The second aspect possesses the same further features and advantages as previously described in respect of the first aspect.

From a third aspect there is also provided a computer program, arranged such that when executed on a computer it causes the computer to perform the method of the first aspect of the invention.

From a fourth aspect, there is further provided a computer readable storage medium storing a computer program according to the third aspect. Preferably, the computer readable storage medium may be any magnetic, optical, magneto-optical, solid-state, or other storage medium known in the art, for example a hard disk, a portable disk, a CD-ROM, a DVD, RAM, ROM, programmable ROM, tape, or the like. It should be noted that the list of computer readable storage media given above is not exhaustive, and any known computer readable storage media may suffice.

Further features and advantages of the present invention can be found in the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

Further features and advantages of the present invention will become apparent from the following descriptions of embodiments thereof, presented by way of example only, and wherein like reference numerals refer to like parts, and wherein:

FIG. 1 is a block system diagram of an apparatus according to the present invention;

FIG. 2 is a block system diagram of the image processor of FIG. 1;

FIG. 3 is a flow diagram illustrating the steps performed by the system of FIG. 1;

FIG. 4 is a flow diagram showing the steps performed by the image processor of FIG. 2;

FIG. 5 is a diagram illustrating in wire frame model form the various geometric shapes which may be used in the embodiments of the present invention;

FIG. 6 is a perspective view illustrating the operating concept behind the embodiments of the present invention;

FIG. 7 is a perspective view also showing the basic operating concept behind the present invention;

FIG. 8 is a perspective view providing for a mathematical analysis of the operation of the present invention;

FIG. 9 is an elevational view of the first image of the present invention;

FIG. 10 is a diagram representing the arrangement of FIG. 8 in plan view;

FIG. 11 is a diagram showing the arrangement of FIG. 8 in plan view, but illustrating a movement of the cylinder;

FIG. 12 is a plan view of FIG. 8 also illustrating a movement of the cylinder;

FIG. 13 is also a plan view of the arrangement of FIG. 8 and also showing a movement of the cylinder through certain angles; and

FIG. 14 is a perspective view of a mobile telephone provided in accordance with one of the embodiments of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The basic operating concept, which represents a first embodiment of the present invention, will now be described with reference to FIG. 7. A more detailed and practical embodiment will be described later.

With reference to FIGS. 6 and 7, the first embodiment of the present invention provides an image processing method and system which provides for the processing of a first image 62 in order to provide a second, processed, image 72. Conceptually the image processing performed by the present invention is illustrated in FIGS. 6 and 7 and can be described as follows.

In FIG. 6 the first image to be processed, image 62, is provided at a certain position and orientation with respect to a three dimensional shape, being in this case the cylinder 60. The cylinder 60 is a three dimensional virtual shape provided within a virtual space, and the image is provided at the certain position and orientation located within the virtual space. In the embodiment the image is located with respect to the cylinder 60 such that the plane of the image is parallel to the axis of the cylinder. It will further be seen that the orientation of the image is such that the “up and down” axis of the image as shown is also parallel to the axis of the cylinder 60. The cylinder 60 is provided with a curved outer surface 64.

In operation, as a first step in the processing the image 62 is “textured” onto the surface 64 of the cylinder by projecting the image 62 onto the surface 64 from the image position. In performing the projection it will be seen that the image orientation is maintained but that the image aspect ratio when applied to the outer surface 64 is changed slightly due to the curving nature of the surface of the cylinder. By “texturing” we mean that the individual pixel luminance and chrominance values are applied to the surface 64 of the cylinder 60 in accordance with the projection of the image thereonto, so that it appears as if the surface 64 is “painted” with the projection of the image 62. It is important to note that the texturing of the surface 64 with the projection of the image 62 effectively binds the respective pixel luminance and chrominance values to the surface 64, such that the image appears to be affixed thereto. Thus, after the projection and texturing the surface 64 is fixedly textured with the projection of the image 62, such that even if the cylinder 60 is moved in any way, the image texturing on the surface 64 moves therewith.

Having textured the surface 64 with the projection of the image 62 such that the projection is bound thereto, the next step in the image processing method is that the cylinder 60 is moved slightly, in this case by a rotation about its axis to a second position 60′, as shown in FIG. 7. Because the projection of the image 62 onto the surface 64 is textured thereon such that it is bound thereto, the textured image on the surface 64 also rotates with the cylinder to a second position 64′ as shown in FIG. 7. The movement of the cylinder 60 to the second position 60′ may be any movement necessary to obtain the desired effect. In the present embodiment, where the desired effect is to process the image to impart a degree of random movement to the head image contained therein, the movement is preferably a rotation about the axis of the cylinder of less than ten degrees in either direction, and preferably of no more than one degree. The rotation may be in either direction, clockwise or anti-clockwise, about the axis.

Within the embodiments of the invention the movement of the shape, i.e. in this case the cylinder 60, is preferably randomly chosen for each image to be processed. That is, a random movement is applied to the shape for each image to be processed. As mentioned above, where the shape is a cylinder and the images are images of heads, preferably the movement of the shape is a rotation about the axis thereof of no more than ten degrees in either direction, but preferably of no more than one degree, the amount of rotation and the direction being randomly chosen within these limits.
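By way of illustration only, the random choice of rotation described above might be sketched as follows. The function name `random_rotation_deg` is hypothetical, and the use of a uniform distribution is an assumption: the description states only that the amount and direction are randomly chosen within the stated limits.

```python
import random

def random_rotation_deg(max_abs_deg=1.0, rng=random):
    """Pick a signed rotation angle in degrees within [-max_abs_deg, +max_abs_deg].

    The sign gives the direction (clockwise or anti-clockwise) and the
    magnitude gives the amount. A uniform distribution is assumed here.
    """
    if max_abs_deg < 0:
        raise ValueError("max_abs_deg must be non-negative")
    return rng.uniform(-max_abs_deg, max_abs_deg)

# One independently chosen angle per image in a sequence of five images.
rng = random.Random(0)  # seeded only so the sketch is repeatable
angles = [random_rotation_deg(1.0, rng) for _ in range(5)]
```

Passing `max_abs_deg=10.0` would correspond to the wider of the two limits mentioned above.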

Once the shape has been moved, the final step in the image processing method of the embodiments of the invention is to take a projection of the pixel luminance and chrominance values bound to the textured surface 64′ from the surface back to the position in relative space of the first image 62. Where the shape has been moved it is necessary to take the projection from the surface back to the original position of the first image in order that the effect of the movement in producing an aspect ratio change of the image is achieved. If the textured surface were merely projected in a direction substantially orthogonal thereto, then the resultant image would be identical to the first image 62. However, by performing the projection from the position of the textured surface 64′ back to the original image position and with the same orientation, a second, processed, image 72 formed from the pixel luminance and chrominance values as projected to the first image position is obtained in the same position and orientation as the first image 62, but with the content thereof processed slightly to represent an aspect change due to the movement of the cylinder 60.
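The three steps described above for the cylinder 60 (texture, rotate, project back) can be sketched in a few lines. The sketch below is illustrative only and rests on assumptions not stated in the description: the projection is taken to be a parallel (orthographic) projection perpendicular to the image plane, the image is centred on the cylinder axis, and sampling is nearest-neighbour. The function name `rotate_on_cylinder`, its parameters, and the sign convention for the angle are hypothetical. Under these assumptions an image column at horizontal offset x lands on the surface at angle asin(x / R), so each output column can be traced back to the source column that was bound to that part of the surface before the rotation.

```python
import math

def rotate_on_cylinder(image, angle_deg, radius=None, fill=0):
    """Texture `image` onto a cylinder, rotate it, and project it back.

    `image` is a list of rows, each row a list of pixel values. The cylinder
    axis is parallel to the image columns. Assumes an orthographic projection.
    """
    width = len(image[0])
    half = (width - 1) / 2.0
    if radius is None:
        radius = width / 2.0          # just wide enough to catch every column
    if radius <= half:
        raise ValueError("radius must exceed half the image width")
    phi = math.radians(angle_deg)

    # For each output column, find the source column whose texture sits there
    # after the rotation. The mapping is the same for every row.
    src_cols = []
    for col in range(width):
        theta_out = math.asin((col - half) / radius)  # where the ray hits now
        theta_src = theta_out - phi                   # where that point was textured
        if abs(theta_src) >= math.pi / 2:
            src_cols.append(None)                     # never faced the image
            continue
        src = int(round(radius * math.sin(theta_src) + half))
        src_cols.append(src if 0 <= src < width else None)

    return [[row[s] if s is not None else fill for s in src_cols]
            for row in image]

# A one-row test image whose pixel values equal their column indices.
strip = [[0, 1, 2, 3, 4, 5, 6, 7, 8]]
moved = rotate_on_cylinder(strip, 10.0)
```

With a rotation of zero the output equals the input. With a non-zero rotation the columns are resampled unevenly across the width, which is the aspect change referred to above.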

In order to process a sequence of images, the same procedure as described above is followed separately for each image. The movement that is applied to the shape, being in this case the cylinder 60, is preferably randomly chosen, and where the shape is a cylinder is preferably a rotation about the main axis of no more than about one degree. Such movements of the shape when textured with each image simulate the natural movements of a human head when speaking.

Alternatively, rather than the movement of the shape being random, the movement may be linked to a measure of the energy in the speech which is to be reproduced at the same time as the resultant processed image 72. For example, where the energy in the speech is great (i.e. the speech is loud) the movement of the shape may be greater than where the energy in the speech is relatively low. This will give the effect of the head presented in the image moving a greater amount when the reproduced speech is of a louder volume.
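One possible way of linking the movement to speech energy is sketched below. It is an illustration under stated assumptions rather than a description of the embodiment: root-mean-square amplitude is used as the energy measure, samples are assumed to be normalised to the range -1 to +1, and the mapping from energy to angle is taken to be linear. The names `rms_energy` and `angle_limit_from_energy` are hypothetical.

```python
import math

def rms_energy(samples):
    """Root-mean-square amplitude of a block of speech samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def angle_limit_from_energy(samples, full_scale=1.0, max_deg=1.0):
    """Map speech energy to a rotation magnitude in degrees.

    Louder speech gives a larger movement, capped at `max_deg`.
    The linear mapping is an assumption made for this sketch.
    """
    level = min(rms_energy(samples) / full_scale, 1.0)
    return max_deg * level

quiet = angle_limit_from_energy([0.1, -0.1, 0.1, -0.1])
loud = angle_limit_from_energy([0.5, -0.5, 0.5, -0.5])
```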

Other arrangements for controlling the movement may also be provided. Although energy is one way, another approach is to control the shape movement via waveform analysis, such that the movement is determined in accordance with the frequencies included within the speech. As an example, the speech waveform is subject to Fourier analysis to determine the frequencies thereof, and the movement of the shape is controlled in dependence on the found frequencies. Thus, the shape movement can be controlled so as to move the shape more if there is more energy in the lower frequencies of the speech than the higher frequencies, or vice versa.
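The frequency-based control described above might be sketched as follows. This is a minimal illustration using a direct discrete Fourier transform on a short frame. The split frequency, the function name `low_band_fraction`, and the idea of using the low-band share of the energy as a movement scale are all assumptions made for the example.

```python
import cmath
import math

def low_band_fraction(samples, sample_rate, split_hz):
    """Fraction of spectral energy lying below `split_hz`.

    Uses a direct DFT over the positive-frequency bins (DC excluded).
    Returns 0.0 for a silent frame.
    """
    n = len(samples)
    low = high = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))
        power = abs(coeff) ** 2
        if k * sample_rate / n < split_hz:
            low += power
        else:
            high += power
    total = low + high
    return low / total if total > 0 else 0.0

# Two 64-sample test frames at an assumed 800 Hz sample rate.
low_tone = [math.sin(2 * math.pi * 100 * t / 800) for t in range(64)]
high_tone = [math.sin(2 * math.pi * 300 * t / 800) for t in range(64)]
```

The returned fraction could then scale the rotation limit, so that speech dominated by lower frequencies moves the shape more, or its complement could be used for the opposite behaviour.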

In another alternative embodiment, the control of the movement is determined not simply upon the sound being reproduced simultaneously, but instead the shape may be moved both before and after sound reproduction is taking place. In this example, if someone is shouting and talking very loudly and quickly (i.e. they are angry) then the shape can be moved to the largest degree (say from −8 degrees to +8 degrees) from side to side quickly, and then once the loud and quick speech has finished the side to side movement of the shape is gradually reduced and slowed down back into its neutral state. Thus the movement of the shape would continue to occur in a diminishing manner even after the speech has stopped.
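The diminishing side to side movement after the speech has stopped could, for example, be generated as a damped oscillation. The sketch below assumes an exponential decay of the amplitude and a fixed oscillation period; neither is specified in the description, which also mentions slowing the movement, something this simple sketch does not model. The name `decaying_sway` and its parameters are hypothetical.

```python
import math

def decaying_sway(peak_deg, n_frames, frames_per_cycle, decay_per_frame):
    """Rotation angle (degrees) for each frame after the speech has ended.

    The angle swings from side to side while its envelope shrinks by the
    factor `decay_per_frame` every frame, settling towards the neutral state.
    """
    return [
        peak_deg * (decay_per_frame ** i)
        * math.sin(2 * math.pi * i / frames_per_cycle)
        for i in range(n_frames)
    ]

# Starting from the +/-8 degree swing mentioned above.
sway = decaying_sway(8.0, 40, 8, 0.9)
```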

With respect to the shape, whereas in the embodiment previously described we have used the cylinder 60, it is not essential that a cylinder be used, and FIG. 5 illustrates various shapes which may be used with the present invention. More particularly, FIG. 5a depicts a sphere, which may be used as the shape. In such a case the sphere may be rotated in any direction, and not just about its polar axis. It is also possible for plural rotations about any axis thereof to be applied sequentially to achieve more complex movements.
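Applying plural rotations sequentially, as mentioned above for the sphere, amounts to rotating each surface point about one axis and then feeding the result into the next rotation. A minimal sketch follows, assuming the rotations are about the coordinate axes of the shape and using the standard right-handed rotation formulas; the function names are hypothetical.

```python
import math

def rotate_about_axis(point, axis, angle_deg):
    """Rotate a 3D point about the x, y or z axis through the origin."""
    x, y, z = point
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    if axis == "x":
        return (x, c * y - s * z, s * y + c * z)
    if axis == "y":
        return (c * x + s * z, y, -s * x + c * z)
    if axis == "z":
        return (c * x - s * y, s * x + c * y, z)
    raise ValueError("axis must be 'x', 'y' or 'z'")

def apply_rotations(point, rotations):
    """Apply a list of (axis, angle_deg) rotations one after another."""
    for axis, angle_deg in rotations:
        point = rotate_about_axis(point, axis, angle_deg)
    return point

# A small turn followed by a small nod of a point on a unit sphere.
nudged = apply_rotations((0.0, 0.0, 1.0), [("y", 1.0), ("x", 0.5)])
```

The order matters: rotations about different axes do not commute in general, so the same list applied in reverse order gives a slightly different result.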

FIG. 5b illustrates a cylinder and preferably the movement applied thereto is a rotation about the long axis, as previously described. Other rotations may be possible, however.

FIG. 5c illustrates an ellipsoid. The movements which may be applied to the ellipsoid of FIG. 5c are similar to those which may be applied to the sphere shown in FIG. 5a, that is, a rotation about the polar axis, but also about any other axis thereof. Similarly FIG. 5d shows an ovoid shape. The shape of FIG. 5d most closely represents that of a human head and hence may produce advantageous effects in its use. The movements which may be applied to the shape of FIG. 5d are the same as those which may be applied to the sphere or the ellipsoid, that is, a rotation about any axis thereof, but preferably about the polar axis.

FIG. 5e shows a shape which we have termed a “double cylinder”, that is, a cylinder the upper and lower surfaces of which are shaped as sections of a second cylinder orthogonal to the direction of the first cylinder. Any movement may be applied to this shape, and in particular any rotation about any axis thereof.

Also, note that it is not essential that the textured shape be moved at all; what is essential is that there is a relative movement of the shape with respect to the image position to which the projection of the textured surface is to be made, i.e. the image position at which the second, processed image is obtained. Thus, in another embodiment the shape may be kept in the same position and the image position to which the projection of the textured surface is made moved around the shape. In a further embodiment both the image position and the shape may be moved, the respective movements being different to give a relative movement therebetween. In any of the cases described above it is the relative movement of the shape and the image position which results in the aspect ratio change effect, and hence the relative movement which is important.

Where the image position is moved, the movements which may be applied to the image are the corollary of those described above for the shape, e.g. a rotation about one or more axes of the shape (note: not of the image).

Having described a first embodiment of the present invention which illustrates the basic operating concept thereof, a more detailed, preferred, embodiment will now be described which embodies the invention within a mobile telephone to allow the lifelike reproduction of an image of the sender speaking a text message which is to be delivered.

FIG. 14 illustrates a mobile telephone according to the second embodiment of the present invention. More particularly, a mobile telephone 100 is provided, which is itself provided with a display screen 102 and a keypad 104 to allow data entry by the user. An audio reproducing means 108, being an audio transducer, is further provided. When a text message is received at the mobile telephone 100 an image 106 of the sender of the text message is displayed on the screen 102, the image being a sequence of visemes which are chosen to match the phonemes being reproduced by the audio reproducing means 108 to articulate the received text message. In accordance with the second embodiment of the invention the image 106 has been processed so as to confer on the image the lifelike movements observed of a human when speaking. The processing of the image 106 to achieve this effect is described next.

FIG. 1 illustrates a system block diagram of the necessary elements required to perform the image processing located within the mobile telephone 100. More particularly, within the mobile telephone 100 is provided a text message receiving means 10 which is arranged to receive a text message, for example in accordance with the short message system (SMS) protocol, and to deliver the text of the received text message to a text buffer 12, wherein it is stored. The text of the message stored in the buffer 12 is read by a parser 14 which acts to parse the text to determine the particular phonemes and the order of reproduction thereof which will be required in order to convert the text into a spoken output. The parser 14 therefore takes the text from the text buffer 12 as input, acoustically parses the text, and outputs a sequence of phonemes corresponding to the text to the control means 16. The precise steps to be performed by the parser in parsing the text to produce the phoneme sequence are known in the art, and in particular with respect to prior art text-to-speech systems such as the BT Laureate system. The representation of the phonemes passed to the control means 16 may be any standard phoneme representation, such as SAMPA or the like.

Also provided within the mobile telephone 100 is a storage medium 20, which provides a storage area for a phoneme store 210 wherein acoustic representations of phonemes are stored, such as in the form of waveforms stored as .WAV files. The storage medium 20 further provides a viseme store 208, which stores a sequence of visemes, suitable for display on the display 102. As mentioned previously, a viseme is an image of a human face with a particular lip shape associated with one or more phonemes.

Also provided within the storage medium 20 is a shape store 206, which stores shape data corresponding to any of the geometrical shapes such as spheres, cylinders, ellipsoids, ovoids, or double cylinders, as shown in FIG. 5. The data representing any particular shape preferably takes the form of the coordinates in three dimensional virtual space of the vertices of a large number of polygons which together make up the shape. The shape surface is further defined by vertex connection information, also included in the shape data, which specifies how to connect up the vertices to form polygons. A set of vertex point coordinate information and vertex connection information is stored for each possible shape within the shape store 206.
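As an illustration of the kind of shape data described above, the following sketch builds vertex coordinates and vertex connection information for the curved surface of a cylinder. The dictionary layout, the function name `make_cylinder`, and the choice of four-sided polygons are assumptions for the example; the description does not specify a storage format.

```python
import math

def make_cylinder(radius, height, segments):
    """Return vertex and polygon data for the side surface of a cylinder.

    The axis runs along y. Vertices are stored as (x, y, z) tuples and each
    polygon is a tuple of four vertex indices forming one side facet.
    """
    vertices = []
    for i in range(segments):
        a = 2 * math.pi * i / segments
        x, z = radius * math.cos(a), radius * math.sin(a)
        vertices.append((x, -height / 2, z))  # bottom ring, index 2*i
        vertices.append((x, height / 2, z))   # top ring, index 2*i + 1

    polygons = []
    for i in range(segments):
        j = (i + 1) % segments                # wrap round to close the surface
        polygons.append((2 * i, 2 * j, 2 * j + 1, 2 * i + 1))

    return {"vertices": vertices, "polygons": polygons}

shape = make_cylinder(radius=1.0, height=2.0, segments=8)
```

Increasing `segments` gives the larger number of polygons referred to above and a smoother approximation of the curved surface.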

Also stored within the storage medium 20 is an operating system program 202 which provides the necessary functionality and protocols for the control means 16 to be able to control and communicate with the various other system elements, as well as an animation program 204 which specifically controls the operation of the control means to control the various other elements to perform the present invention. Preferably, the storage medium 20 provided within the mobile phone 100 is a solid state storage medium, a FLASH multimedia card, or the like.

Also provided within the mobile telephone 100 is an image processor 18, which is controlled by the control means 16 in accordance with instructions contained within the animation program 204, and which is further arranged to access both the viseme store 208 to obtain viseme images therefrom, and the shape store 206 in order to obtain data representing 3D geometric shapes therefrom. The image processor 18 is capable of reading the 3D shape data, in order to model the shape therein. The image processor 18 is arranged to output processed images to the display means 102 for display thereon.

Also provided within the mobile telephone 100 is the audio reproducing means 108, which is arranged to receive phoneme waveforms from the phoneme store 210 and to reproduce the phoneme waveforms in the received order to reproduce the speech represented thereby.

The phoneme store 210, the viseme store 208, and the image processor 18 are all under the control of the control means 16, which maintains synchronisation therebetween. More particularly, the control means 16 controls the phoneme store 210 to output phonemes to the audio reproducing means 108 in sequence to reproduce the received text message as speech. Similarly, the control means 16 controls the viseme store 208 to cause the viseme store 208 to output the correct viseme image to the image processor 18 corresponding to a particular phoneme to be reproduced which is to be output from the phoneme store 210. The image processor 18 processes the image received from the viseme store and outputs the processed image to the display means 102. By maintaining control of the phoneme store, viseme store, and the image processor 18 centrally with the control means, it is possible to maintain synchronisation between the processed visemes being output from the image processor 18 and displayed at the display means 102, and the phoneme waveforms output from the phoneme store 210 for reproduction by the audio reproducing means 108, such that the appropriate processed viseme is displayed on the display means 102 at the same time as the corresponding phoneme waveform is being reproduced by the audio reproducing means 108.

FIG. 2 illustrates the system functions performed by the image processor 18. More particularly, within the image processor 18 is provided a controller 8, which receives control signals from the control means 16. Control lines shown as dotted lines in FIG. 2 are provided from the controller 8 to each of a texturing means 2, a shape modelling means 4, and a projector means 6. The texturing means 2 receives at its input signals representing images output from the viseme store 208. The shape modelling means 4 receives at its input the data representing the three dimensional shapes from the shape store 206. The shape modelling means 4 is capable of interpreting the received representative data to virtually model the shape represented by the data, such as a cylinder, sphere, ellipsoid, ovoid, or the like.

Further provided within the image processor 18 is a projector means 6 which receives information from the shape modelling means, and acts to perform a virtual projection of the textured surface of the shape to an image position to obtain a second processed image. The projector means 6 therefore outputs the processed image from the processor 18.

Having described the conceptual internal structure of the mobile telephone 100 in accordance with the present invention, the operation of the various elements will now be described with respect to FIGS. 3 and 4. More particularly, FIG. 3 represents a flow diagram of the operation of the overall system, whereas FIG. 4 is a flow diagram of the steps specifically performed by the image processor 18 in accordance with the present invention.

With reference to FIGS. 3 and 4, at step 3.2 a text message is received by the text message receiving means 10. Following this, at step 3.4 the text message is stored in the text buffer 12, and then at step 3.6 the parser 14 acts to read the text message as required from the text buffer 12. The parser 14 parses the text of the message to obtain the phoneme representation thereof at step 3.8. As discussed previously, parsing of text to obtain a phoneme representation is known in the art.

The phoneme representation obtained by the parser is passed to the control means 16, which at step 3.10 controls the phoneme store to read therefrom the acoustic waveform of the first phoneme in the sequence received from the parser. At substantially the same time, at step 3.12 the control means controls the viseme store to output from the viseme store 208 the viseme corresponding to the phoneme being presently output from the phoneme store 210. The phoneme output from the phoneme store 210 is passed to the audio reproducing means 108, whereas the viseme output from the viseme store 208 is passed to the image processor 18.

At step 3.14 the image processor 18 processes the received viseme in order to “animate” the image by applying the image processing method of the present invention thereto. This produces the effect that the head represented in the viseme appears to have moved slightly.

Following the processing of the input viseme at the image processor 18, at step 3.16 the processed viseme is output to the display means 102 where it is then displayed to the user. At substantially the same time, at step 3.18 the audio reproducing means 108 plays the acoustic waveform of the present phoneme as received from the phoneme store 210.

Once the present processed viseme has been displayed and the present phoneme played, an evaluation is undertaken at step 3.20 to determine whether all the phonemes in the sequence determined by the parser 14 have been played. If the evaluation determines that all the phonemes have been played then the procedure ends. In contrast, if it is determined that there remain phonemes to be played, and associated visemes to be displayed, then at step 3.22 the control means 16 moves on to the next phoneme in the sequence and processing returns to step 3.10. Steps 3.10, 3.12, 3.14, 3.16, 3.18, 3.20, and 3.22 are then repeated in a loop until all of the phonemes output from the parser 14 to the control means 16 have been played, and the associated visemes processed and displayed to the user.
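By way of illustration only, the loop of steps 3.10 to 3.22 might be sketched in software as follows. This is a minimal sketch and not a definitive implementation: the names `Stores`, `play_message`, `process_viseme`, `display` and `play_audio` are hypothetical stand-ins for the phoneme store 210, viseme store 208, image processor 18, display means 102 and audio reproducing means 108, and the data types are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass
class Stores:
    """Hypothetical stand-ins for the phoneme store 210 and viseme store 208."""
    phoneme_store: Dict[str, bytes]  # phoneme -> acoustic waveform (placeholder type)
    viseme_store: Dict[str, str]     # phoneme -> viseme image (placeholder type)


def play_message(
    phonemes: Sequence[str],
    stores: Stores,
    process_viseme: Callable[[str], str],
    display: Callable[[str], None],
    play_audio: Callable[[bytes], None],
) -> int:
    """Run steps 3.10 to 3.22 for each phoneme in order; return how many were played."""
    played = 0
    for phoneme in phonemes:                      # steps 3.20/3.22: advance until none remain
        waveform = stores.phoneme_store[phoneme]  # step 3.10: read the acoustic waveform
        viseme = stores.viseme_store[phoneme]     # step 3.12: read the corresponding viseme
        processed = process_viseme(viseme)        # step 3.14: apply the image processing
        display(processed)                        # step 3.16: show the processed viseme
        play_audio(waveform)                      # step 3.18: play the phoneme
        played += 1
    return played
```

In this sketch the display and audio calls are made one after the other; a practical system would need to schedule them so that they occur at substantially the same time.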

Having described the overall operation of the system, the specific steps performed by the image processor 18 when performing step 3.14 will now be described with respect to FIG. 4.

At step 4.2, the image processor 18 receives a viseme output from the viseme store 208 under the control of the control means 16. Following this, at step 4.4 the image processor 18 accesses the shape store 206 and retrieves the data representative of the 3D shape to which the received image is to be applied. The shape modelling means 4 provided in the image processor 18 receives the data from the shape store, and uses the data to model the shape as a three dimensional geometric shape in virtual space. Next, the texturing means 2 acts at step 4.6 to texture the surface of the shape with a projection of the received image. By texturing we mean that the luminance and chrominance values of the image pixels as projected onto the surface of the shape are effectively bound to the surface, such that the surface is effectively “painted” with the image projection, as previously described.

Following the shape texturing, at step 4.8 the shape modelling means 4 acts to perform a movement of the shape and models the textured shape in the moved position relative to the image position. As described previously in respect of the first embodiment, the movement may be a rotation about any axis of the shape, and in the preferred embodiment the shape is a cylinder and the movement is a rotation about the main axis thereof by no more than ten degrees, and preferably by no more than one degree in either direction. The precise movement applied to the textured shape is preferably randomly chosen for each iteration of the process, with the result that each processed image has had a different shape movement applied thereto.
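As an illustration of the random choice of movement, a rotation angle might be drawn afresh for each processed image as in the following sketch. The function name `choose_rotation_deg` and the use of a uniform distribution are assumptions made for the example; the description requires only that the movement be randomly chosen within the stated bounds.

```python
import random

PREFERRED_MAX_DEG = 1.0   # preferred bound from the description
ABSOLUTE_MAX_DEG = 10.0   # outer bound from the description


def choose_rotation_deg(rng: random.Random, max_deg: float = PREFERRED_MAX_DEG) -> float:
    """Return a rotation angle in degrees, drawn uniformly from [-max_deg, +max_deg].

    A uniform distribution is an assumption of this sketch; any random choice
    within the bounds would satisfy the description.
    """
    if not 0.0 <= max_deg <= ABSOLUTE_MAX_DEG:
        raise ValueError("max_deg must lie between 0 and 10 degrees")
    return rng.uniform(-max_deg, max_deg)
```

Passing an explicitly seeded `random.Random` instance makes a sequence of movements reproducible, which may be convenient when testing.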

In alternative embodiments, as discussed previously, the movement of the shape may be controlled in response to the speech energy in the phoneme which is to be reproduced with the processed viseme.
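The description does not specify how speech energy is to be mapped to movement. Purely as one possible sketch, the root-mean-square level of the phoneme waveform could be scaled linearly to a rotation angle and clamped to the preferred bound. The functions `rms_energy` and `rotation_from_energy`, the linear mapping, and the `full_scale` parameter are all assumptions of this example.

```python
import math
from typing import Sequence


def rms_energy(samples: Sequence[float]) -> float:
    """Root-mean-square level of a waveform; 0.0 for an empty waveform."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rotation_from_energy(
    samples: Sequence[float],
    full_scale: float = 1.0,  # assumed RMS level that maps to the maximum angle
    max_deg: float = 1.0,     # preferred bound from the description
) -> float:
    """Map waveform energy to a rotation angle in [0, max_deg] (assumed linear mapping)."""
    level = min(rms_energy(samples) / full_scale, 1.0)  # clamp loud phonemes to the bound
    return level * max_deg
```

With this mapping a silent phoneme produces no movement and louder phonemes produce proportionally larger movements up to the bound; the direction of rotation could still be chosen separately.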

Furthermore, in other alternative embodiments it is the image position which is moved relative to the shape, or a combination of both shape and image position relative movement, as discussed previously. As in the previously described embodiment, it is the relative movement between the shape and image position of the processed image which is important, not the absolute movements of each.

Following step 4.8, that is after the textured shape has been modelled in the moved position, at step 4.10 the projector means 6 obtains information of the moved textured shape from the shape modelling means 4 and acts to take a projection of the textured surface of the shape to the position in virtual space of the original received image. The image (comprising the projected pixel luminance and chrominance values) obtained by such a projection located at the position of the original received image is then output by the projector means as the output from the image processor 18 to the display means 102, at step 4.12. The display means 102 then displays the processed image to the user, as described previously.
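For the preferred case of a cylinder rotated about its main axis, steps 4.6 to 4.10 can be illustrated on a single row of pixels as follows. This is a sketch under stated assumptions rather than a definitive implementation: it assumes an orthographic (parallel) projection, a cylinder axis passing behind the centre of the row, a radius measured in pixels, and nearest-pixel rounding; the function name `forward_project_row` is hypothetical.

```python
import math
from typing import List


def forward_project_row(
    row: List[int], radius: float, phi_deg: float, background: int = 0
) -> List[int]:
    """Texture one pixel row onto a cylinder, rotate it by phi_deg, and project it back."""
    width = len(row)
    centre = (width - 1) / 2.0           # assumed: cylinder axis behind the row centre
    phi = math.radians(phi_deg)
    out = [background] * width
    for i, value in enumerate(row):
        x = i - centre                   # horizontal offset of the source pixel
        if abs(x) > radius:
            continue                     # pixel misses the cylinder
        theta = math.asin(x / radius)    # step 4.6: bind the pixel to a surface angle
        alpha = theta + phi              # step 4.8: rotate the textured cylinder
        if abs(alpha) > math.pi / 2:
            continue                     # rotated onto the far side, no longer visible
        x_out = radius * math.sin(alpha) # step 4.10: project back to the image position
        j = int(round(x_out + centre))
        if 0 <= j < width:
            out[j] = value
    return out
```

Because each source pixel is pushed forward to a rounded destination, this form can leave unfilled gaps in the output row; computing, for each output pixel, the source position it came from avoids that.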

In accordance with the foregoing, it should therefore be understood that the present invention provides an image processing method and system which allows for images to be processed to produce a particular “aspect ratio change” effect by virtue of the image being applied to a three dimensional shape, a relative movement then being effected between the shape and an image position, and a projection of the applied image then being taken to the image position to obtain a second image. Where the images are visemes, then the relative movement of the shape with respect to the image position to achieve the aspect ratio change effect within the resultant image simulates the small head movements observed of real humans when talking. Therefore, the wooden, lifeless nature of previous viseme images is alleviated, and the overall effect is of a more lifelike image.

It should be noted that whilst the preferred embodiment of the invention uses the image of a human head, the invention is not limited to processing images of human heads, and images of animal heads, fantasy heads or the like may be used. In particular, by “fantasy heads” we mean images of the heads of fantasy creatures such as, for example, those of the Teletubbies™ as shown on BBC Television, or those of characters from fantasy films such as Star Wars® or the like. Moreover, it should be further understood that whilst the invention has been specifically developed for processing images of faces (whether human, animal, fantasy or otherwise), its use is not limited to the processing of facial images, and any image which requires an aspect ratio change as previously described and provided by the invention may be processed thereby.

Within the embodiments previously described we have described the invention on a conceptual basis by virtue of the fact that the image is projected onto a shape, the shape then rotated, and then the projection taken from the shape back to the original image position. Whilst in a preferred implementation of the invention the concept as already described may be maintained, and especially using specialist three-dimensional virtual reality programming languages as are known in the art, it should also be understood that the same image processing effects can be obtained via a purely mathematical algorithm. That is, the image to be textured onto the shape can be considered to be simply an array of points located at discrete co-ordinates in a known co-ordinate system, and the shape, instead of being represented by polygons as in the preferred embodiment, may instead merely be represented by an appropriate equation in the known co-ordinate system. The resultant processed image can then be mathematically obtained using the equation for the geometric shape, and applying the appropriate transform to the image co-ordinates using the shape equation. Such an implementation is clearly intended to be covered by the appended claims, in that it really does no more than embody the concept of the present invention as already described in respect of the preferred embodiments. For completeness, therefore, there follows a mathematical analysis of the image processing method according to the present invention, which represents in mathematical terms the basic projection, relative movement, and second projection steps already described. The mathematical analysis of the present invention will be described with reference to FIGS. 8 to 13.

FIG. 8 shows a perspective view of the three-dimensional virtual space in which the image and shape objects of the present invention exist. In particular, FIG. 8 illustrates the respective positioning and orientation of the image to be processed and the virtual shape model within the virtual space.

FIG. 9 illustrates the image to be processed from a viewpoint directly in front, i.e. orthogonal to the plane of the image. Conversely, FIG. 10 shows a view taken from above in the direction of the axis of the cylinder. In this case the view is along the image plane, such that the image is no longer evident from this viewpoint (due to it being two-dimensional). FIGS. 11, 12, and 13 illustrate a similar view.

Where the shape is a cylinder it is only necessary to do the calculations for a circle at any point on the surface of the cylinder, for example at the level of AB. The resultant formula will then be the same the whole way up or down the vertical dimension of the cylinder, as the cross-section in this direction does not vary.

Firstly, calculate the luminance of the head textured onto the cylinder, that is $L_C$. The only requirement is that when the cylinder is facing exactly forward it must project on to the screen to appear identical to the luminance of the original still picture, $L_S$ say.

With reference to FIG. 11, let $\vartheta$ be the angle subtended by the point of interest in the image. $\vartheta$ is ‘fixed’ to the cylinder.

Since $x = r\sin\vartheta$:

$$L_C(\vartheta) = L_S(x) = L_S(r\sin\vartheta) \qquad \text{Eq 1}$$

Next, with reference to FIG. 12, calculate the luminance displayed on the screen ($L_D$) when the cylinder has been rotated such that the point of interest lies at an angle $\alpha$.

Here, let $L_{CR}$ be the luminance of the rotated cylinder (at $\alpha$), then

$$L_D(x) = L_{CR}(\alpha), \quad \text{where } x = r\sin\alpha \quad \text{(see FIG. 12)};$$

but

$$L_{CR}(\alpha) = L_{CR}(\vartheta + \phi) \quad \text{(see FIG. 13)}$$

where $\alpha = \vartheta + \phi$, since it is an angle of rotation $\phi$ plus an angle of displacement $\vartheta$, i.e.

$$L_{CR}(\vartheta + \phi) = L_C(\vartheta)$$

Angle of displacement is $\vartheta$. Angle of rotation is $\phi$.

Now, from Eq 1: $L_C(\vartheta) = L_S(r\sin\vartheta)$

But: $r\sin\alpha = x$

Therefore:

$$\begin{aligned}
\vartheta + \phi &= \arcsin(x/r) \\
\Rightarrow \vartheta &= \arcsin(x/r) - \phi \\
\Rightarrow L_C(\vartheta) &= L_S\left\{ r\sin\left(\arcsin(x/r) - \phi\right) \right\} \\
\Rightarrow L_C(\vartheta) &= L_S\left\{ x\cos\phi - \sqrt{r^{2} - x^{2}}\,\sin\phi \right\} \\
\Rightarrow L_D(x) &= L_S\left\{ x\cos\phi - \sqrt{r^{2} - x^{2}}\,\sin\phi \right\}
\end{aligned}$$
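The closed-form result of the derivation can be applied directly to a pixel array, as the following sketch suggests. It is offered under several assumptions that are not taken from the description: the image is a list of equal-length rows of luminance values, the radius is measured in pixels, the cylinder axis lies behind the centre column, the sign convention for the rotation is that used in the derivation, and the nearest source pixel is taken without interpolation. The function name `rotate_cylinder_image` is hypothetical.

```python
import math
from typing import List, Optional


def rotate_cylinder_image(
    image: List[List[int]], radius: float, phi_deg: float, background: int = 0
) -> List[List[int]]:
    """Return the displayed image L_D(x) = L_S(x cos(phi) - sqrt(r^2 - x^2) sin(phi))."""
    width = len(image[0]) if image else 0
    centre = (width - 1) / 2.0              # assumed: cylinder axis behind the centre column
    phi = math.radians(phi_deg)
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)

    # The cross-section does not vary along the axis, so one column lookup serves every row.
    source_col: List[Optional[int]] = []
    for j in range(width):
        x = j - centre
        if abs(x) > radius:
            source_col.append(None)         # output pixel lies outside the cylinder
            continue
        theta = math.asin(x / radius) - phi # angle of displacement on the cylinder
        if abs(theta) > math.pi / 2:
            source_col.append(None)         # surface point was never textured
            continue
        x_s = x * cos_phi - math.sqrt(radius * radius - x * x) * sin_phi
        i = int(round(x_s + centre))        # nearest source pixel (no interpolation)
        source_col.append(i if 0 <= i < width else None)

    return [
        [row[i] if i is not None else background for i in source_col]
        for row in image
    ]
```

Because this form looks up a source position for every output pixel, it leaves no gaps in the output other than where the rotated cylinder presents an untextured part of its surface. A zero rotation returns the image unchanged.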

Thus, it will be seen from the foregoing that the present invention can be embodied in either the conceptual basis as described in respect of the preferred embodiment, or on a purely mathematical basis as shown above, and that the two are not mutually exclusive. The same effect of obtaining a processed image which has undergone an “aspect ratio change” is obtained in either case.
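One way of seeing why the result amounts to an “aspect ratio change” is to examine how quickly the source position moves as the displayed position $x$ moves. The following is an illustrative calculation only, in which the name $g$ for the mapping is introduced here for convenience and does not appear in the figures. Writing the displayed luminance as $L_D(x) = L_S(g(x))$:

$$\begin{aligned}
g(x) &= x\cos\phi - \sqrt{r^{2} - x^{2}}\,\sin\phi \\
\frac{dg}{dx} &= \cos\phi + \frac{x}{\sqrt{r^{2} - x^{2}}}\,\sin\phi
\end{aligned}$$

At the centre of the image ($x = 0$) the derivative is $\cos\phi$, which for a rotation of one degree is approximately $0.99985$. The horizontal scale therefore changes only very slightly near the centre, and varies with $x$ towards the edges of the cylinder, while the vertical scale is unaffected by a rotation about the main axis.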

Furthermore, the invention should not be taken as being limited to implementation through electronic or other processing means, as it is quite possible to implement the invention and achieve the same effects through an actual physical implementation of the invention. Therefore, in further embodiments there is provided an image projector arranged to display an image via projection onto a surface of a shape. The image projector may be a slide projector, or a digital light projector, or the like. A three-dimensional shape made, for example, of plastic or polystyrene and preferably light-coloured is further provided, positioned relative to the image projector to receive the projection of the image on a surface thereof. A camera, which may be digital, video, or film-based or the like, is further provided focused on the projection of the image on the shape surface, and arranged to capture images thereof. The camera is preferably provided adjacent to but not exactly co-located with the projector, such that the camera optical axis is at a slight angle (of preferably no more than ten degrees, and more preferably no more than one degree) to the optical axis of the projector. Such a location of the camera simulates the movement of the image position relative to the shape, without there having to be any actual movement of any of the projector, shape, or camera.

The operation of such an arrangement is straightforward, and similar to the previously described embodiments. That is, the projector acts to project an image, being preferably a viseme or the like, onto the surface of the shape, thereby effectively “texturing” the surface. The camera, being focused on the surface of the shape, captures an image of the projected image, which by virtue of the optical axis of the camera being at an angle to the optical axis of the projector exhibits the “aspect ratio change” effect provided by the invention. The image captured by the camera is then output as the processed image.

For a succession of images to be processed, the position of the camera and/or the projector relative to each other is altered, such that an apparently random aspect ratio change is observed for each successive image.

Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise”, “comprising” and the like are to be construed in an inclusive as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”.

1. An image processing method comprising the steps of: a) texturing at least one surface of a three-dimensional shape with a projection of a first image from an image position and orientation thereof onto said at least one surface; b) moving one or both of the shape and/or the image position relative to each other; and c) projecting the textured surface of the shape to the image position to obtain a second image at the image position and in the same orientation as said first image.
 2. A method according to claim 1, wherein said first image forms part of a sequence of first images, the method further comprising repeating steps a), b), and c) for each first image in said sequence to obtain a corresponding sequence of second images.
 3. A method according to claim 2, wherein the respective sequences of first and second images each form an animated sequence of a face speaking.
 4. A method according to claim 2, and further comprising displaying said second sequence of second images to a user, and reproducing recorded or synthesised sound, which may be speech, in synchrony with said display.
 5. A method according to claim 4, wherein the moving step further comprises moving the shape and/or the image position in dependence on the energy in the reproduced sound.
 6. A method according to claim 1, wherein the moving step further comprises randomly moving the shape and/or the image position.
 7. A method according to claim 1, wherein the moving step further comprises rotating said three-dimensional shape about one or more axes thereof.
 8. A method according to claim 7, wherein said rotation is no more than 10 degrees in either direction about one or more of said axes, and preferably no more than 1 degree in either direction about said axes.
 9. A method according to claim 1, wherein the shape is one of a group consisting of: a sphere, a cylinder, an ellipsoid, an ovoid, or a double-cylinder.
 10. A method according to claim 1, wherein said first image is an image comprising a plurality of pixels located at the first position in a virtual space, and said shape is data representative of a 3D virtual model of a shape located in said virtual space, modelled by a processor.
 11. A computer program which when executed by a computer system causes the computer system to perform the method of claim 1.
 12. A computer-readable storage medium storing a computer program according to claim 11.
 13. An image processing system comprising: image receiving means for receiving a first image to be processed; image processing means; and image output means for outputting a second, processed image; characterised in that the image processing means further comprises: shape modelling means arranged to model a three-dimensional shape; and is further arranged to: a) texture at least one surface of a three-dimensional shape with a projection of a first image from an image position and orientation thereof onto said at least one surface; b) move one or both of the shape and/or the image position relative to each other; and c) project the textured surface of the shape to the image position to obtain a second image at said position and in the same orientation as said first image.
 14. A system according to claim 13, wherein said first image forms part of a sequence of first images, the system being further arranged to receive each first image in said sequence at said image receiving means, said image processing means being arranged to repeat steps a), b), and c) for each first image in said sequence to obtain a corresponding sequence of second images.
 15. A system according to claim 14, wherein the respective sequences of first and second images each form an animated sequence of a face speaking.
 16. A system according to claim 14, and further comprising display means for displaying said second sequence of second images to a user, and sound reproduction means for reproducing recorded or synthesised sound, which may be speech, in synchrony with said display.
 17. A system according to claim 16, wherein the image processing means is further arranged to move the shape and/or the image position in dependence on the energy in the reproduced sound.
 18. A system according to claim 13, wherein the image processing means is further arranged to move the shape and/or the image position randomly.
 19. A system according to claim 13, wherein the image processing means is further arranged to move the shape by rotating said three-dimensional shape about one or more axes thereof.
 20. A system according to claim 19, wherein said rotation is no more than 10 degrees in either direction about one or more of said axes, and preferably no more than 1 degree in either direction about said axes.
 21. A system according to claim 13, wherein the shape is one of a group consisting of: a sphere, a cylinder, an ellipsoid, an ovoid, or a double-cylinder.
 22. A system according to claim 13, wherein said first image is an image comprising a plurality of pixels located at said first position in a virtual space, and said shape is data representative of a 3D virtual model of a shape located in said virtual space, modelled by said shape modelling means.